Andrew Zammit Mangion
09 October 2014
Environmental systems are characterised by phenomena which evolve both in space and in time (cloud modelling, weather and climate, glacial flow, spread of invasive species etc.).
The aim of spatio-temporal modelling is to
Divide-and-conquer approach to a complex problem.
Two approaches:
General-purpose model, used when the dynamic structure is largely unknown.
where
Continuous-time ST processes are computationally tedious to deal with.
Focus on models of the form
where
where
The SPDE
has as solution (Whittle, 1954)
where
Non-stationarity can be handled by making the SPDE spatially varying. The resulting covariance function is no longer of Matérn form, but do we care?
We can model SPDEs on manifolds, not restricted to
SPDEs can be approximated to arbitrary precision using an appropriate basis (Lindgren, 2011).
For
(Lindgren, 2014)
Useful when we have multiple interacting spatio-temporal processes
Consider the bi-variate sub-system of random forcings:
where
SPDEs are a useful way to model complex phenomena.
Real power comes in using finite elements to decompose and represent them as GMRFs
From now on we restrict ourselves to the linear, Gaussian case
Collapse all variates, and their spatial and temporal components into one big GMRF
Sparsity is retained, and one can make use of fill-in-reducing permutations when computing the Cholesky factor (Knorr-Held and Rue, 2002)
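The effect of a fill-in-reducing permutation can be seen on a toy example. The sketch below is an illustration only: the "arrow" precision matrix `Q`, its size and the reversed ordering `p` are assumptions chosen for clarity, and dense base R `chol` stands in for a sparse Cholesky routine.

```r
# Hypothetical toy precision matrix with an "arrow" pattern:
# node 1 is linked to every other node, all other nodes only to node 1.
n <- 6
Q <- diag(10, n)
Q[1, ] <- 1
Q[, 1] <- 1
Q[1, 1] <- 10   # restore the diagonal entry; Q is diagonally dominant

# Count entries of a Cholesky factor that are numerically non-zero.
count_nonzeros <- function(R) sum(abs(R) > 1e-12)

# Natural ordering: eliminating node 1 first couples all remaining nodes,
# so the factor fills in completely.
nnz_natural <- count_nonzeros(chol(Q))

# Reversed ordering: node 1 is eliminated last and no fill-in occurs.
p <- n:1
nnz_permuted <- count_nonzeros(chol(Q[p, p]))

print(c(natural = nnz_natural, permuted = nnz_permuted))
```

Both factors represent the same distribution; only the ordering of the variables differs, which is why the permutation can be chosen purely for computational convenience.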
Lemma 1: Takahashi Equations (Erisman and Tinney, 1975). Let $L = \mathrm{chol}(Q)$. If $L_{ij} \neq 0$, then $\Sigma_{ij}$ can be computed using elements in $L$ and $\Sigma^{*}_{IJ}$, $I = \{i' ; i' > i\}$, $J = \{j' ; j' > j\}$.
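A minimal sketch of the recursion behind the Takahashi equations is given below. It assumes the factorisation $Q = LL^T$ with lower-triangular $L$, for which $\Sigma_{ij} = \delta_{ij}/L_{ii}^2 - L_{ii}^{-1}\sum_{k>i} L_{ki}\Sigma_{kj}$ for $j \ge i$. The function name `takahashi`, the dense storage and the tridiagonal test matrix are illustrative assumptions, not the implementation used for the results in these slides.

```r
# Compute the entries of Sigma = solve(Q) on the sparsity pattern of the
# Cholesky factor only; entries outside the pattern are left as NA.
takahashi <- function(Q) {
  n <- nrow(Q)
  L <- t(chol(Q))                # lower-triangular factor, Q = L %*% t(L)
  pattern <- abs(L) > 1e-12      # numerically non-zero entries of L
  S <- matrix(NA_real_, n, n)
  for (i in n:1) {               # work from the last row upwards
    ks <- which(pattern[, i] & seq_len(n) > i)   # k > i with L[k, i] != 0
    for (j in rev(which(pattern[, i]))) {        # j >= i, largest j first
      s <- sum(L[ks, i] * S[ks, j])
      S[i, j] <- S[j, i] <- ((i == j) / L[i, i] - s) / L[i, i]
    }
  }
  S
}

# Hypothetical tridiagonal precision matrix (a first-order Markov chain).
n <- 6
Q <- diag(2.5, n)
Q[cbind(1:(n - 1), 2:n)] <- -1
Q[cbind(2:n, 1:(n - 1))] <- -1

S_partial <- takahashi(Q)
computed <- !is.na(S_partial)
max_abs_error <- max(abs(S_partial[computed] - solve(Q)[computed]))
```

Only the diagonal and first off-diagonals of the covariance matrix are computed here, which is enough to read off the marginal variances without ever forming the dense inverse.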
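Theorem 1: Let $y = Cx + v$ and $v \sim \mathcal{N}(0, T^{-1})$. If (i) $\mathrm{zeros}(A) \subseteq \mathrm{zeros}(C)$, (ii) $A$ is non-negative and (iii) $T$ is diagonal, then $\mathrm{diag}(A \Sigma^{*} A^T) \equiv \mathrm{diag}(A \Sigma^{*}_{00} A^T)$.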
Lemma 2: If $A$ is non-negative then $[A^T A]_{jk} = 0 \Leftrightarrow \mathrm{diag}(A \Sigma^{*} A^T)$ does not depend on $\Sigma^{*}_{jk}$.
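Lemma 2 can be checked numerically on a small example. In the sketch below the covariance matrix `Sigma`, the non-negative matrix `A` and all sizes are made-up assumptions; the point is only that zeroing every entry of the covariance matrix outside the non-zero pattern of $A^T A$ leaves the variances of the linear combinations unchanged.

```r
set.seed(1)
n <- 8

# Hypothetical dense covariance matrix (symmetric positive definite).
M <- matrix(rnorm(n * n), n, n)
Sigma <- crossprod(M) + diag(n)

# Hypothetical non-negative matrix of linear combinations (3 rows).
A <- matrix(0, 3, n)
A[1, c(1, 2)] <- 0.5
A[2, c(2, 3, 4)] <- 1 / 3
A[3, 8] <- 1

# Variances of the linear combinations using the full covariance matrix.
var_full <- diag(A %*% Sigma %*% t(A))

# Keep only the entries of Sigma where t(A) %*% A is non-zero.
mask <- crossprod(A) != 0
Sigma_partial <- Sigma * mask
var_partial <- diag(A %*% Sigma_partial %*% t(A))

# The same diagonal without forming the full product A Sigma A^T.
var_cheap <- rowSums((A %*% Sigma_partial) * A)
```

Here `mask` retains 14 of the 64 entries of `Sigma`, yet all three ways of computing the variances agree.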
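# First 5000 rows of the matrix returned by getC
Asub <- getC(e[[1]])[1:5000, ]
# Variances of the linear combinations from the partial covariance matrix
system.time({ est <- diag(Asub %*% Results$Partial_Cov %*% t(Asub)) })
> user system elapsed
> 5.232 0.465 5.698
# For comparison: direct solves with the permuted Cholesky factor
Lp <- Results$Qpermchol
P <- Results$P
system.time({ X <- solve(Lp, t(P) %*% t(Asub))
est <- crossprod(X, X) })
> user system elapsed
> 221.250 0.382 221.688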
In practice the Takahashi equations are always used with GMRFs in order to obtain the marginal variances. Theorem 2 provides a means to obtain uncertainty on linear combinations essentially for free.
I can provide predictive uncertainties at any point provided that each row of
Theorem 3: Any planar map divided into contiguous regions can be coloured using at most FOUR colours (Appel and Haken, 1976)
In typical Bayesian networks, finding
With a spatial network, this is remarkably easy using the Kernighan-Lin algorithm for graph partitioning (Kernighan and Lin, 1970).
Let
Find a partition of two disjoint, equal-size subsets
Setting edge weights = 1, gives a bisection algorithm!
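The idea of bisection by swapping nodes across the cut can be sketched as follows. This is a simplified greedy pair-swap heuristic in the spirit of Kernighan-Lin, not the full algorithm (it has no locking pass and can stop in a local minimum). The function names `cut_size` and `greedy_swap_bisection` and the six-node graph are hypothetical.

```r
# Total weight of the edges crossing the partition; `side` is TRUE for
# nodes in the first subset and FALSE for nodes in the second.
cut_size <- function(W, side) sum(W[side, !side])

greedy_swap_bisection <- function(W, side) {
  repeat {
    to_A <- as.vector(W %*% as.numeric(side))    # weight to subset A
    to_B <- as.vector(W %*% as.numeric(!side))   # weight to subset B
    # D[v] = external cost minus internal cost of node v
    D <- ifelse(side, to_B - to_A, to_A - to_B)
    a <- which(side)
    b <- which(!side)
    # Reduction in cut size obtained by swapping a[i] with b[j]
    gain <- outer(D[a], D[b], "+") - 2 * W[a, b, drop = FALSE]
    if (max(gain) <= 0) break                    # no improving swap left
    best <- which(gain == max(gain), arr.ind = TRUE)[1, ]
    side[a[best[1]]] <- FALSE
    side[b[best[2]]] <- TRUE
  }
  side
}

# Hypothetical graph: two triangles (1-2-3 and 4-5-6) joined by edge 3-4,
# all edge weights equal to 1.
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
W <- matrix(0, 6, 6)
W[edges] <- 1
W[edges[, 2:1]] <- 1

start <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)   # deliberately poor split
result <- greedy_swap_bisection(W, start)
cut_before <- cut_size(W, start)
cut_after <- cut_size(W, result)
```

Because every swap exchanges one node from each subset, the two subsets keep their equal sizes throughout.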
Sampling from the prior guarantees enormous speed-ups using MPI
Observations with large spatial footprints cause the four-colour theorem to fail.
No two adjacent nodes in a super-graph can have the same colour and no two nodes of the same colour can be spanned by the same observation
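The two colouring constraints can be combined into a single conflict graph and coloured greedily, as in the sketch below. The greedy rule, the function name `greedy_colour`, the chain of four blocks and the single wide observation are all illustrative assumptions; a greedy colouring is valid but not guaranteed to use the fewest colours.

```r
# Assign each node the smallest colour not used by a conflicting node
# that has already been coloured.
greedy_colour <- function(conflict) {
  n <- nrow(conflict)
  colour <- integer(n)                 # 0 means "not yet coloured"
  for (v in seq_len(n)) {
    used <- colour[conflict[v, ] & colour > 0]
    colour[v] <- min(setdiff(seq_len(n), used))
  }
  colour
}

# Hypothetical super-graph: four blocks in a chain 1 - 2 - 3 - 4.
adj <- matrix(FALSE, 4, 4)
adj[cbind(1:3, 2:4)] <- TRUE
adj <- adj | t(adj)

# Hypothetical observation whose footprint spans blocks 1, 2 and 3
# (rows are observations, columns are blocks).
obs <- matrix(c(1, 1, 1, 0), nrow = 1)
shared <- crossprod(obs) > 0           # blocks spanned by a common observation
diag(shared) <- FALSE

colours_adjacency_only <- greedy_colour(adj)
colours_with_obs <- greedy_colour(adj | shared)
```

With adjacency alone two colours suffice for the chain; the wide observation links blocks 1 and 3, so a third colour is needed. Blocks sharing a colour could then be updated in parallel.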
Multivariate spatio-temporal statistics has a lot to offer across a wide range of environmental problems, and much of this potential remains untapped.
Computational properties of GMRFs allow us to consider major sources of uncertainty with influence on experimental design: e.g. in multivariate spatial problems, coverage is not as important as mixture diversity
Parameter estimation in large-scale spatio-temporal problems is hard but usually possible (INLA, VB, MCMC); in large-scale, multivariate, under-determined spatio-temporal problems it is considerably harder. Parallel sampling offers a possible solution.
Parallel sampling can also become intractable. What about parallel approximate Bayesian sparse methods? http://arxiv.org/abs/1305.4152
Jonathan Rougier (University of Bristol)
Botond Cseke (University of Edinburgh)
Finn Lindgren (University of Bath)